FIG. 1A 2736 bp 

51 GAAGCGCGCC GCCAAGGCCG AGGAGATCCA CCAGAACAAC GAGGAGGAGG 

101 AGGAGGTCGC GGCGGCGTCC TCCGCCAAGC GCAGCCGCAA GGCGGCATCT 

151 TCCGGGAAGA AGCCCAAGTC GCCCCCCAAG CAGGCCAAGC CGGGGAGGAA 

201 GAAGAAGGGG GATGCCGAGA TGAAGGAGCC CGTGGAGGAC GACGTGTGCG 

251 CCGAGGAGCC CGACGAGGAG GAGTTGGCCA TGGGCGAGGA GGAGGCCGAG 

301 GAGCAGGCCA TGCAGGAGGA GGTGGTTGCG GTCGCGGCGG GGTCACCCGG 

351 GAAGAAGAGG GTGGGGAGAA GGAACGCCGC CGCCGCCGCT GGCGACCACG 

401 AGCCGGAGTT CATCGGCAGC CCTGTTGCCG CGGACGAGGC GCGCAGCAAC 

451 TGGCCCAAGC GCTACGGCCG CAGCACTGCC GCAAAGAAAC CGGATGAGGA 

501 GGAAGAGCTC AAGGCCAGAT GTCACTACCG GAGCGCTAAG GTGGACAACG 

551 TCGTCTACTG CCTCGGGGAT GACGTCTATG TCAAGGCTGG AGAAAACGAG 

601 GCAGATTACA TTGGCCGCAT TACTGAATTT TTTGAGGGGA CTGACCAGTG 

651 TCACTATTTT ACTTGCCGTT GGTTCTTCCG AGCAGAGGAC ACGGTTATCA 

701 ATTCTTTGGT GTCCATAAGT GTGGATGGCC ACAAGCATGA CCCTAGACGT 

751 GTTTTTCTTT CTGAGGAAAA GAACGACAAT GTGCTTGATT GCATTATCTC 

801 CAAGGTCAAG ATAGTCCATG TTGATCCAAA TATGGATCCA AAAGCCAAGG 

851 CTCAGCTGAT AGAGAGTTGC GACCTATACT ATGACATGTC TTACTCTGTT 

901 GCATATTCTA CATTTGCTAA TATCTCGTCT GAAAATGGGC AGTCAGGCAG 

951 TGATACCGCT TCGGGTATTT CTTCTGATGA TGTGGATCTG GAGACGTCAT 

1001 CTAGTATGCC AACGAGGACA GCAACCCTTC TTGATCTGTA TTCTGGCTGT 

1051 GGGGGCATGT CTACTGGTCT TTGCTTGGGT GCAGCTCTTT CTGGCTTGAA 

1101 ACTTGAAACT CGATGGGCTG TTGATTTCAA CAGTTTTGCG TGCCAAAGTT 

1151 TAAAATATAA TCATCCACAG ACTGAGGTGC GAAATGAGAA AGCCGATGAG 

1201 TTTCTTGCCC TCCTTAAGGA ATGGGCAGTT CTATGCAAAA AATATGTCCA 

1251 AGATGTGGAT TCAAATTTAG CAAGCTCAGA GGATCAAGCG GATGAAGACA 

1301 GCCCTCTTGA CAAGGACGAA TTTGTTGTAG AGAAGCTTGT CGGGATATGT 

1351 TATGGTGGCA GTGACAGGGA AAATGGCATC TATTTTAAGG TCCAGTGGGA 

1401 AGGATACGGC CCTGAGGAGG ATACATGGGA ACCGATTGAT AACTTGAGTG 

1451 ACTGCCCGCA GAAAATTAGA GAATTTGTAC AAGAAGGGCA CAAAAGAAAG 

1501 ATTCTCCCAC TGCCTGGTGA TGTTGATGTC ATTTGTGGAG GCCCACCATG 

1551 CCAAGGTATC AGTGGGTTTA ATCGGTACAG AAACCGTGAT GAGCCACTCA 

1601 AAGATGAGAA AAACAAACAA ATGGTGACTT TCATGGATAT TGTGGCGTAC 

1651 TTGAAGCCCA AGTATGTTCT CATGGAAAAT GTGGTGGACA TACTCAAATT 

1701 TGCGGATGGT TACCTAGGAA AATATGCTTT GAGCTGCCTT GTTGCTATGA 

1751 AGTACCAAGC GCGGCTTGGA ATGATGGTGG CTGGTTGCTA TGGTCTGCCA 

1801 CAGTTCAGGA TGCGTGTGTT CCTCTGGGGT GCTCTTTCTT CCATGGTGCT 

1851 CCCTAAGTAT CCTCTGCCCA CCTATGATGT TGTAGTACGT GGAGGAGCCC 

1901 CTAATGCCTT TTCGCAATGT ATGGTTGCAT ATGACGAGAC ACAAAAACCA 

1951 TCCCTGAAAA AAGCCTTGCT TCTTGGCGAT GCAATTTCAG ATTTACCAAA 

2001 GGTTCAAAAT CACCAGCCTA ACGATGTGAT GGAGTATGGT GGTTCCCCCA 

2051 AGACCGAATT CCAGCGCTAC ATTCGACTCA GTCGTAAAGA CATGTTGGAT 

2101 TGGTCCTTCG GTGAGGGGGC TGGTCCAGAT GAAGGCAAGC TCTTGGATCA 

2151 CCAGCCTTTA CGGCTTAACA ACGATGATTA TGAGCGGGTT CAACAGATTC 

2201 CTGTCAAGAA GGGAGCCAAC TTCCGCGACC TAAAGGGCGT GAGGGTTGGA 

2251 GCAAACAATA TTGTTGAGTG GGATCCAGAA ATCGAGCGTG TGAAACTTTC 

2301 ATCTGGGAAA CCACTGGTTC CTGACTATGC AATGTCATTC ATCAAGGGCA 

2351 AATCACTCAA GCCGTTTGGG CGCCTGTGGT GGGACGAGAC AGTTCCTACA 

2401 GTTGTAACCA GAGCAGAGCC TCACAACCAG GTTATAATTC ATCCGACTCA 

2451 AGCAAGGGTC CTCACTATCC GGGAGAACGC AAGGTTACAG GGCTTCCCCG 

2501 ATTACTACCG ATTGTTTGGC CCGATCAAGG AGAAGTACAT TCAAGTCGGG 

2601 AGCCTACCTG GGTGAATCTG AGGGGAGTGA CCCTCTGTAC CAGCTGCCTC 

2701 CCTGTTGGCA CCCCTGCAGG GGAGGTAGTT GAGCAG 
FIG. 1B 



1 AGAGCAGCAG CAGCTACCGC AGCCCCTGCC ATGGCGCCGA GCTCCCCGTC 

51 ACCCGCCGCG CCTACACGCG TCTCTGGGCG GAAGCGCGCC GCCAAGGCCG 

101 AGGAGATCCA CCAGAACAAG GAGGAGGAGG AGGAGGTCGC GGCGGCGTCC 

151 TCCGCCAAGC GCAGCCGCAA GGCGGCATCT TCCGGGAAGA AGCCCAAGTC 

201 GCCCCCCAAG CAGGCCAAGC CGGGGAGGAA GAAGAAGGGG GATGCCGAGA 

251 TGAAGGAGCC CGTGGAGGAC GACGTGTGCG CCGAGGAGCC CGACGAGGAG 

301 GAGTTGGCCA TGGGCCAGGA GGAGGCCGAG GAGCAGGCCA TGCAGGAGGA 

351 GGTGGTTGCG GTCGCGGCGG GGTCACCCGG GAAGAAGAGG GTGGGGAGAA 

401 GGAACGCCGC CGCCGCCGCT GGCGACCACG AGCCGCAGTT CATCGGCAGC 

451 CCTGTTGCCG CGGACGAGGC GCGCAGCAAC TGGCCCAAGC GCTACGGCCG 

501 CAGCACTGCC GCAAAGAAAC CGGATGAGGA GGAAGAGCTC AAGGCCAGAT 

551 GTCACTACCG GAGCGCTAAG GTGGACAACG TCGTCTACTG CCTCGGGGAT 

601 GACGTCTATG TCAAGGCTGG AGAAAACGAG GCAGATTACA TTGGCCGCAT 

651 TACTGAATTT TTTGAGGGGA CTGACCAGTG TCACTATTTT ACTTGCCGTT 

701 GGTTCTTCCG AGCAGAGGAC ACGGTTATCA ATTCTTTGGT GTCCATAAGT 

751 GTGGATGGCC ACAAGCATGA CCCTAGACGT GTTTTTCTTT CTGAGGAAAA 

801 GAACGACAAT GTGCTTGATT GCATTATCTC CAAGGTCAAG ATAGTCCATG 

851 TTGATCCAAA TATGGATCCA AAAGCCAAGG CTCAGCTGAT AGAGAGTTGC 

901 GACCTATACT ATGACATGTC TTACTCTGTT GCATATTCTA CATTTGCTAA 

951 TATCTCGTCT GAAAATGGGC AGTCAGGCAG TGATACCGCT TCGGGTATTT 

1001 CTTCTGATGA TGTGGATCTG GAGACGTCAT CTAGTATGCC AACGAGGACA 

1051 GCAACCCTTC TTGATCTGTA TTCTGGCTGT GGGGGCATGT CTACTGGTCT 

1101 TTGCTTGGGT GCAGCTCTTT CTGGCTTGAA ACTTGAAACT CGATGGGCTG 

1151 TTGATTTCAA CAGTTTTGCG TGCCAAAGTT TAAAATATAA TCATCCACAG 

1201 ACTGAGGTGC GAAATGAGAA AGCCGATGAG TTTCTTGCCC TCCTTAAGGA 

1251 ATGGGCAGTT CTATGCAAAA AATATGTCCA AGATGTGGAT TCAAATTTAG 

1301 CAAGCTCAGA GGATCAAGCG GATGAAGACA GCCCTCTTGA CAAGGACGAA 

1351 TTTGTTGTAG AGAAGCTTGT CGGGATATGT TATGGTGGCA GTGACAGGGA 

1401 AAATGGCATC TATTTTAAGG TCCAGTGGGA AGGATACGGC CCTGAGGAGG 

1451 ATACATGGGA ACCGATTGAT AACTTGAGTG ACTGCCCGCA GAAAATTAGA 

1501 GAATTTGTAC AAGAAGGGCA CAAAAGAAAG ATTCTCCCAC TGCCTGGTGA 

1551 TGTTGATGTC ATTTGTGGAG GCCCACCATG CCAAGGTATC AGTGGGTTTA 

1601 ATCGGTACAG AAACCGTGAT GAGCCACTCA AAGATGAGAA AAACAAACAA 

1651 ATGGTGACTT TCATGGATAT TGTGGCGTAC TTGAAGCCCA AGTATGTTCT 

1701 CATGGAAAAT GTGGTGGACA TACTCAAATT TGCGGATGGT TACCTAGGAA 

1751 AATATCCTTT GAGCTGCCTT CTTGCTATGA AGTACCAAGC GCGGCTTGGA 

1801 ATGATGGTGG CTGGTTGCTA TGGTCTGCCA CAGTTCAGGA TGCGTGTGTT 

1851 CCTCTGGGGT GCTCTTTCTT CCATGGTGCT CCCTAAGTAT CCTCTGCCCA 

1901 CCTATGATGT TGTAGTACGT GGAGGAGCCC CTAATGCCTT TTCGCAATGT 

1951 ATGGTTGCAT ATGACGAGAC ACAAAAACCA TCCCTGAAAA AAGCCTTGCT 

2001 TCTTGGCGAT GCAATTTCAG ATTTACCAAA GGTTCAAAAT CACCAGCCTA 

2051 ACGATGTGAT GGAGTATGGT GGTTCCCCCA AGACCGAATT CCAGCGCTAC 

2101 ATTCGACTCA GTCGTAAAGA CATGTTGGAT TGGTCCTTCG GTGAGGGGGC 

2151 TGGTCCAGAT GAAGGCAAGC TCTTGGATCA CCAGCCTTTA CGGCTTAACA 

2201 ACGATGATTA TGAGCGGGTT CAACAGATTC CTGTCAAGAA GGGAGCCAAC 

2251 TTCCGCGACC TAAAGGGCGT GAGGGTTGGA GCAAACAATA TTGTTGAGTG 

2301 GGATCCAGAA ATCGAGCGTG TGAAACTTTC ATCTGGGAAA CCACTGGTTC 

2351 CTGACTATGC AATGTCATTC ATCAAGGGCA AATCACTCAA GCCGTTTGGG 

2401 CGCCTGTGGT GGGACGAGAC AGTTCCTACA GTTGTAACCA GAGCAGAGCC 

2451 TCACAACCAG GTTATAATTC ATCCGACTCA AGCAAGGGTC CTCACTATCC 

2501 GGGAGAACGC AAGGTTACAG GGCTTCCCCG ATTACTACCG ATTGTTTGGC 

2551 CCGATCAAGG AGAAGTACAT TCAAGTCGGG AACGCAGTGG CTGTCCCTGT 

2601 TGCCCGGGCA CTGGGCTACT GTCTGGGGCA AGCCTACCTG GGTGAATCTG 

2651 AGGGGAGTGA CCCTCTGTAC CAGCTGCCTC CAAGTTTCAC ATCTGTTGGA 

2701 GGACGCACTG CGGGGCAGGC GAGGGCCTCT CCTGTTGGCA CCCCTGCAGG 

2751 GGAGGTAGTT GAGCAGTAAA AGGATGACAG ATCTGAGCTG AGCTGG 
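
The open reading frame in FIG. 1B appears to begin at the ATG at position 31 and to end at the TAA that follows GAGCAG in the last row. As a minimal sketch, assuming the standard genetic code applies, the conceptual translation can be checked as follows. The names `CODON_TABLE` and `translate` are illustrative and do not come from the source.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order for each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, to_stop=True):
    """Translate a DNA string in frame 1; unknown codons become 'X'."""
    dna = dna.upper().replace(" ", "")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*" and to_stop:
            break  # stop codon reached
        protein.append(aa)
    return "".join(protein)


# First seven codons of the reading frame, taken from row 1 of FIG. 1B.
print(translate("ATGGCGCCGAGCTCCCCGTCA"))  # MAPSSPS
```

Unreadable bases in the figure would surface as `X` in the output, which makes damaged codons easy to spot.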
FIG. 2A 912 amino acids 

1 MAPSSPSPAA PTRVSGRKRA AKAEEIHQNK EEEEEVAAAS SAKRSRKAAS 

51 SGKKPKSPPK QAKPGRKKKG DAEMKEPVED DVCAEEPDEE ELAMGEEEAE 

101 EQAMQEEVVA VAAGSPGKKR VGRRNAAAAA GDHEPEFIGS PVAADEARSN 

151 WPKRYGRSTA AKKPDEEEEL KARCHYRSAK VDNVVYCLGD DVYVKAGENE 

201 ADYIGRITEF FEGTDQCHYF TCRWFFRAED TVINSLVSIS VDGHKHDPRR 

251 VFLSEEKNDN VLDCIISKVK IVHVDPNMDP KAKAQLIESC DLYYDMSYSV 

301 AYSTFANISS ENGQSGSDTA SGISSDDVDL ETSSSMPTRT ATLLDLYSGC 

351 GGMSTGLCLG AALSGLKLET RWAVDFNSFA CQSLKYNHPQ TEVRNEKADE 

401 FLALLKEWAV LCKKYVQDVD SNLASSEDQA DEDSPLDKDE FVVEKLVGIC 

451 YGGSDRENGI YFKVQWEGYG PEEDTWEPID NLSDCPQKIR EFVQEGHKRK 

501 ILPLPGDVDV ICGGPPCQGI SGFNRYRNRD EPLKDEKNKQ MVTFMDIVAY 

551 LKPKYVLMEN VVDILKFADG YLGKYALSCL VAMKYQARLG MMVAGCYGLP 

601 QFRMRVFLWG ALSSMVLPKY PLPTYDVVVR GGAPNAFSQC MVAYDETQKP 

651 SLKKALLLGD AISDLPKVQN HQPNDVMEYG GSPKTEFQRY IRLSRKDMLD 

701 WSFGEGAGPD EGKLLDHQPL RLNNDDYERV QQIPVKKGAN FRDLKGVRVG 

751 ANNIVEWDPE IERVKLSSGK PLVPDYAMSF IKGKSLKPFG RLWWDETVPT 

801 VVTRAEPHNQ VIIHPTQARV LTIRENARLQ GFPDYYRLFG PIKEKYIQVG 

851 NAVAVPVARA LGYCLGQAYL GESEGSDPLY QLPPSFTSVG GRTAGQARAS 

901 PVGTPAGEVV EQ 
FIG. 2B 



RAAAA.7A?iFA.>C;p 5 S P S PAAP T RVS GRKBAAICAZEIHQNKEiZEEVAAA5 
SAKa£ RXAAS S GKXF KS P? KQAKPGRKKKGDAEKKEEVEDDVCAEEPDEE 
SIAMGSEEAEEQAMQEZVVAVAAGS P GKKRVGRWIAAAAAGDHE5EFI GS 
PVAADEARS>rrfPKPyGR3TAAXKPDEEEELKARCHyRSAKVX)NWYCLGD 
DVYVKAGENEADYIGRITEFFEGTDQCKYFTCRWFFRAEDTVINSLVSrS 
VDGHJCiDFRRVTLSEEKNIWVT,DCIISKVKIVHVDPNMDPKAKAQLIESC 
DLYYDMSySVAYSTFANISSENGQSGSDTASGISSDDVDLETSSSMPTRT 
ATLLDLYSGCGG^^STGLCI/GAALSGLKL£TRHAVDFNSFACQSLX«^HPQ 
■. TEVRNE:KAOEFIAI.LKEWAVLCKiarVQr>VDSMLASSEDQADEDSPLDKllE 
■ FWEKLVGI CYGGS DRENGI YFKVQWEGYGPEEDTMEP IDNLSDCPQKIR 
EFVQEGHKRKI LPLPGDVDVI CGGPPCCGI S GFNRYRXRDEPLKDEKNKQ 
MVTf>DIVXYLKPKYVLMENVVLIlKFADGVLGKYALS CLVJ^MKYQARLG 
MMVAGCYGLPQFRKRVPLWGALSSMVLPKYPLPTYDVV^/RGGAPflAFSQC 
MVAYDSTQKPSLKKALLLGDAISDLPKVQNHQPNDVKBYGGSPKTEFQRY 
IRLSRKDJILDWSFGEGAGPDEGKLLDHQPLRLNNDDYERVQSIPVKKGMI 
FRDLKGVRVGANNIVEWDPEIERVKLSSGXPLVPDYAMSFIKGXSIiKPFG 
RLWVnDETVPTVVTRAEPHMQVIIHPTQARVLTIREKARLQGFPDYrRLFG 
PrKEKYlQVGNAVAVPVARALGYCLGOAVIGESEGSDPLYOLPESFTSVG 
GRTAGvARAS PVGTPAGEWEQ-KDDRSELSW 
FIG. 3 



Primer          Sequence 5' - 3'

1F              TGGTTGCTATGGTCTGCCAGAGTTCAG
1R              CCAGCTCAGCTCAGATCTGTCATCCTTT
Seq2FN          CGAAAGCTAATCTACACAAACAGC
[illegible]     GATCCTCTGAGCTTGCTAAATTTG
3R              CTCATCTTGGAGTGGCTCATCAC
S3F             GAGCACATGAGGGAGAGTGTTG
S3R             TCTCTAATTTTCTGCGGGCAG
4F              CCTCTGCCCACCTATGATGTTGTA
5F              TAAAGGGCGTGAGGGTTGGA
7F              TCACATTTGTCATGGCAGGTTATC
8eF             [illegible]
9eF             GAAGAAGAGGGTGGGGAGAAGGAACG
[illegible]     [illegible]
11iF            GTATTGAATTGATTCTCAACTAGTGCAC
11iR            CAGGCTCAACGGCGATG
12iF            TATGCTTCATCACATAGACCCAAGTC
12iR            GATAGACCTAATGCCAAATGAGATTAAG
13iF            GCGATCTTCAGTCTCCACCATC
13iR            GAAGACGTGCCTCCATGTTTCATC
14F             GTTGGTTCTTCCGAGCAGAGG
14R             GACTGCCACATATCTTATTAATCGC
15F             GCATGTGTCAGCAATTGCTTACATTC
15R             CCTCTGCTCGGAAGAACCAAC
16F             CTGTTCGGAGATTCATGCATGATG
16R             [illegible]
17F             GCACTTCACTCTCCTGGCAAACC
17R             CGGTACGCTGCTGCTGCTCTC
18F             CCATAGCATCTCACATATCGCAAGG
18R             GGAAAGAAGGCAGTTAGTTGTAAATGGG
MU              AGAGAAGCCAACGCCAWCGCCTCYATTTCGTC
RaceRT          CTACAACATCATAGTTGGGCAGAGG
AP2 marathon    ACTCACTATAGGGCTCGAGCGGC
T7              TAATACGACTCACTATAGGG
Sp6             GATTTAGGTGACACTATAG
M13F            GTTTTCCCAGTCACGAC
M13R            CAGGAAACAGCTATGAC
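
Forward primers in FIG. 3 read in the same direction as the sequence in FIG. 1B, while reverse primers anneal to the opposite strand, so a reverse primer is found in the listed sequence only as its reverse complement. The sketch below illustrates this with primers 14F and 15R, which are reverse complements of each other, against a 50-nucleotide excerpt of FIG. 1B (positions 681 to 730). The helper names `reverse_complement`, `locate_primer`, and `TEMPLATE` are illustrative assumptions, and only exact matches are considered.

```python
# Complement mapping for unambiguous DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# Excerpt of FIG. 1B, positions 681 to 730, spaces removed.
TEMPLATE = "TCACTATTTTACTTGCCGTTGGTTCTTCCGAGCAGAGGACACGGTTATCA"


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def locate_primer(template, primer):
    """Return (strand, 1-based start) for every exact match of the primer.

    '+' means the primer matches the template as written; '-' means its
    reverse complement matches, i.e. the primer anneals to the other strand.
    """
    template = template.upper()
    primer = primer.upper()
    hits = []
    for strand, query in (("+", primer), ("-", reverse_complement(primer))):
        start = template.find(query)
        while start != -1:
            hits.append((strand, start + 1))
            start = template.find(query, start + 1)
    return hits


print(locate_primer(TEMPLATE, "GTTGGTTCTTCCGAGCAGAGG"))  # 14F: [('+', 18)]
print(locate_primer(TEMPLATE, "CCTCTGCTCGGAAGAACCAAC"))  # 15R: [('-', 18)]
```

Both primers land on the same 21 nucleotides of the excerpt, one on each strand.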
FIG. 5 

HhaI MseI 
tacatcaataaaataaggggcgccaacgcaattgtcccttGttttttctaacttaaagttcaagcggcaatgtcg base pairs 
atgtagttattttattccccgcggttgcgttaacagggaacaaaaaagattgaatctcaagcccgccgtcacagc 1 to 75 

MseI MseI 

cacccgatgtacgaacatcaatt cgtaagtaccaaacacagccctaaccccaaaattaaccacacaacaaagact base pairs 
gcagaccacatacttacagttaaacactcacgactcgcatcagaattaaggctttaactaacgcgccgcctctga 76 to 150 

MseI HinfI 
aaattgtaagcaaacctttcaagtctaattaattcataactacaaatgctatcgtaacatcatgccaccgaatca base pairs 
tctaacactcgcttggaaagttcagattaattaagcatcaacgcteacaataacattgtagcacaacggcttagt 151 to 225 

ScrFI 

EcoRII MseI 
caaactaaccaggcccccatgtgtaactagttttataatcatattacacttaacattcgtaaccaattgatgtga base pairs 
atctgactggcccaagggcacacaECaaccaaaacaccaaCacaacacaaaccataaacattgaccaactacacc 226 to 300 
BstNI 

MseI MseI MseI 
cagcactaaaactaagcctcttaagccaaaaaatccacatactctagatccaaaatttgaaaacagacgtatcgg base pairs 
gccacgattttaattcggagaattcggttttttaggtgtataaaacctaaatcttaaacttttgtctgcatagcc 301 to 375 

HaeIII 

;tacaagaagtggcccataccagttccatcaccagtccagtag base pairs 
latgttcttcaccgggtatgatcaaggtagtggtcaggtcatc 376 to 450 

PvuII HaeIII HaeIII HhaI HhaI HpaII 

tccaccaccccaccctacagctgggtcatctggcacgggtggaggggccaacggccaaaagcgccgcgcacttcc base pairs 
aggtggtggggtgggatgtcgacccagcagaccgcgcccacctccccggctgccggttttcgcggcgcgtgaagg 451 to 525 

MspI 

ApaI 

HinfI PstI EcoO109I 

ggcgggcaccctcgcggagtcgcgggcgacagcgaaatttcaaatccataccctcccgctgcagacgggccccac base pairs 
ccgcccgtgggagcgcctcagcgcccactgtcgctttaaagtttaggtatgggagggcgacgtctgcccggggtg 526 to 600 

HaeIII 

TaqI 

gccgccaaaacttggacgctcccgctccctcgaccccttgggttccgtttccccagttcccaccctctcctccac base pairs 
cggcagttttaaacctgcgagggcgagggagctagaaaacccaaagcaaaagggtcaagggtgggagagaaggtg 601 to 675 
Sau3AI 

Sau3AI TaqI TaqI HinfI 
cctgccctgtttccagatttgaccgatccccttcgattcgatttctacacccacggtgtccagactccagaacac base pairs 
ggacgggacaaaggtctaaactggctaggggaagctaagctaaagatgtgggtgccacaggtctgaggtctcgtg 676 to 750 
HinfI 

ScrFI 
17F EcoRII 

r.captccgrtQocaaacc cccttcQtctteccaaccctaoaqaqcaqcaqcaqctaccqcaqcccctoccatqqc base pairs 
agtLgagaggacrg t c tqgggaaageagaagggt toqqa t ct ct catCQtcat coa t aoc qtcaQQaacagtaCca 751 to 825 
BstNI 17R 
HhaI SacI HhaI HhaI HaeIII Sau3AI 

gccgagctccccgtcacccgccgcgcctaeacgcgtctctgggcggaagcgcgccgccaaggccgaggagatcca base pairs 
cggctcgaggggcagtgggcggcgcggatgtgcgcagagacccgccttcgcgcggcggttccggctcctctaggt 826 to 900 

ScrFI 

HhaI HpaII 
ccagaacaaggaggaggaggaggaggtcgcggcggcgtcccccgccaagcgcagccgcaaggcggcatcttccgg base pairs 
ggtcttgttcctcctcctcctcctccagcgccgccgcaggaggcggttcgcgtcggcgttccgccgtagaaggec 901 to 975 

MspI 

MspI 
HaeIII ScrFI 

gaagaagcccaagtcgccccccaagcaggccaagccggggaggaagaagaagggggatgccgagatgaaggagcc base pairs 
cttcttcgggttcagcggggggttcgtccggttcggccccccccccttcttccccctacggctctacttcctcgg 976 to 1050 
HpaII 
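
The enzyme names printed above and below the strands in FIG. 5 mark where each recognition sequence occurs. As a rough sketch, such positions can be recomputed by scanning the top strand for the recognition sequences. The `SITES` table covers only a few of the enzymes named in the figure, `find_sites` and `EXAMPLE` are illustrative names, and the positions reported are where the recognition sequence starts, not the exact cut position.

```python
import re

# Recognition sequences (5' to 3') for a subset of the enzymes in FIG. 5.
# N stands for any base.
SITES = {
    "HhaI": "GCGC",
    "MseI": "TTAA",
    "HaeIII": "GGCC",
    "HpaII": "CCGG",
    "HinfI": "GANTC",
    "TaqI": "TCGA",
    "Sau3AI": "GATC",
    "PvuII": "CAGCTG",
    "PstI": "CTGCAG",
}

# Top strand of base pairs 1 to 75 as printed in FIG. 5.
EXAMPLE = "tacatcaataaaataaggggcgccaacgcaattgtcccttGttttttctaacttaaagttcaagcggcaatgtcg"


def find_sites(seq, sites=SITES):
    """Return {enzyme: [1-based start positions]} for enzymes with a match."""
    seq = seq.upper()
    hits = {}
    for enzyme, site in sites.items():
        # A lookahead lets overlapping occurrences be reported as well.
        pattern = "(?=" + site.replace("N", "[ACGT]") + ")"
        positions = [m.start() + 1 for m in re.finditer(pattern, seq)]
        if positions:
            hits[enzyme] = positions
    return hits


print(find_sites(EXAMPLE))  # {'HhaI': [20], 'MseI': [53]}
```

For this first block the scan reports one HhaI site and one MseI site, which agrees with the two labels printed over base pairs 1 to 75.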
HhaI HaeIII HaeIII 

cgtggaggacgacgtgtgcgccgaggagcccgacyaggaggagttggccatgggcgaggaggaggccgaggagca t 
gcacctcctgctgcacacgcggctccccgggccgctcccccccaaccggtacccgctcctcctccggccccccgt 3 

MspI 

HaeIII HpaII 9eF 

ggccacgcaggaggaggtggti:gcggtcgcggeggggtcacceggggggg^ggoggtq.gqg?qat" 
ccggtacgtcctcctccaccaacgccagcgccgccccagcgggcccccctcctcccacccccctti 
ScrFI 
SmaI 

HpaII HhaI HaeIII 

cgccgccgctggcgaccacgagccggagtccatcggcagccctgttgccgcggacgaggcgcgcagcaaccggcc base pairs 
gcggcggcgaccgctggcgctcggcctcaagcagccgtcgggacaacggcgcctgctccgcgcgccgttgaccgg 1201 to 1275 

MspI 

HhaI HaeIII 

caaagcgctacggccgcagcacttgccgcaaaagaagtacactactccctcccagctccggccccgacccgacca base pairs 
gfrrrgrgargggg gcqtcqtoaacoQCQCCC ccct eacataataaaaoaqoqtcQaqaecaaaaceaaaecqqt 1276 to 1350 
9eR 

HpaII 

gatctttactccacgtctgttagtactcgcgagctgagcaacccgctactcgctgatttaccgcgcgtgcagacc base pairs 
ctaaaaacgaggtacagacaaccatgaacgctcgactcgttagacgacaaacgactaaataacacgcacgcccgg 1351 to 1425 

MspI 

SacI HaeIII HpaII HhaI 

ggatgaggaggaagagctcaaggccagatgtcactaccggagcgctaaggcggacaacgtcgtccactgcctcgg base pairs 
cctactcctcctccccgagccccggtctacagtgacggcctcgcgatcccacctgctgcagcagatgacggagcc 1426 to 1500 
MspI 

EcoO109I 

ggatgacgtctatgtcaaggcccttgtccatcgctctctgttgcttctgccctcattcatgatgtgcatatgtgt base pairs 
cctactgcagatacagttccaggaacaagtagcgaaagacaacgaagacgagagtaaatactacacgcatacaca 1501 to 1575 
AvaII 

MseI HinfI HpaII 

ttgttaaggaagcaagaattgcttgatttttgctgccgactcgcatttccgtgacgagttctgcgtatggtcacc base pairs 
aacaattccttcgttcttaacgaactaaaaacaacggctgagcgtaaaggcactgctcaagacgcataccagtgg 1576 to 1650 

MspI 

ScrFI 

TaqI BstNI Sau3AI 

ggtacgcggcaccgatacacaacgcggtatgccggaagtctggtagcataCtttgcatcgaccaggaggtccaga base pairs 
ccatgcaccgcgactatgtgttgcaccatacgaccttcagaccatcacataaaacgtagctggccctccaggtct 1651 to 1725 

EcoRII AvaII 

ClaI HinfI 
tcgatacgtgeggtatagtgcttatttgatcgcaecccgttcggggaetcacgcgtqflcggcgtgtctagatgac base pairs 
agctatacacgccacatcacgaataaaccaacgtgggacaagcctctaagcacgtactaccgcacaaacctaccg 1726 to 1800 

TaqI 

ScrFI BstNI 
PvuII EcoRII PvuII HaeIII HpaII HhaI HinfI 

gcctcccagacagccgcccgccaggcagctgattctggcccaggcgtccggaatggtgaagttgcgctggcaaga base pairs 
cggagggtctgtcgacggacggtccgccgactaagaccgggtccgcaggccttaccacttcaacgcgaccgttct 1801 to 1875 
BstNI HinfI EcoRII MspI 
ScrFI 

ScrFI 

HaeIII EcoRII 
ttctcaggccacctaccaaatatgccctggagcatattgcacgcttctttttttgttctctttccctctatattt base pairs 
aagagtccggtggatggtttatacgggacctcgtataacgtacgaagaaaaaaacaagagaaaggaagatacaaa 1876 to 1950 
BstNI 

atcccattgttagtgaagtttcacatcgcacgcgtcatggaatatttactctcaaatcaacgaggagatgctagc base pairs 
tagagtaacaatcacttcaaagtgtaacgcgcacagcaccttacaaatgaaagtctagccgcccctccacgaccg 1951 to 2025 
accgaggcgtgcgataatcaccacatactagaagacatcgcgcatgttgccactgggattgcgaagaatgcggaa base pairs 
taaccccacacactactaacaatgtatgaccctccacagcacgcacaacggtaaccccaacgccccttacacctc 2026 to 2100 

MseI MseI 

atgacgtcggtagcctgtatcaagaggttaacagtcagtgggacgacacgaccattagctagagacgatgcggat base pairs 
tactacaaccatcgaacataatcctccaattgtcaatcaccccactgcactgacaatcaatctccactacaccta 2101 to 2175 

agtaagtgggatatgatgtagatgacttgtgtgttgagacagaactataacacggagttggaaatgggagcagca base pairs 
tcattcaccctatactacatctactgaacacacaactccgtcttgatattgtacctcaacctttaccctcgtcgt 2176 to 2250 

MseI 

cggtcaaacataccetaaatgcctgtctccacacaatgtggtgattggtgtatagcctggtgttaaaagctggat base pairs 
accagtttgcatgggatttacggacagagatgtgttacaccactaaccacatatcagaccacaattttcgaccta 2251 to 2325 

ScrFI 

HinfI MseI XbaI EcoRII 

actttgattctgttgaagactgtcacacccgaacttaaggacaaatctagatacatctcatatgtgcaccaggat base pairs 
tgaaactaagacaacttctaacagtgtgggcttaaattcctgtttagatctatgtagagtatacacgtggtccta 2326 to 2400 

BstNI 



MseI MseI 
cacctttcaacatgtttaatgctgcaaactgctttaattaaacagaacgcagtgttttgaacaaaaaaatgctgc base pairs 
gtggaaagttgcacaaaccacgacgtctgacaaaactaatctgtcccacgccacaaaacccgtctctttacgacg 2476 to 2550 



iSiF Sau3AI HinfI MseI 

tCf-atgecocatetcgtttt QcatQtatcagraatcocttacatt ccattacoatcEctaaoattccttaaattt base pairs 
aaacaggacgcagaacaaaacgtacacagtcgttaacgaatgtaaggcaatactagagactctaagaaatttaaa 2551 to 2625 

ctagcatgatgaaagtatttaccaattcaactgaacacaaacattgcctgaatgaacaaggcaacacggatgctt base pairs 
gatcgtactactttcataaatgattaagttgacttgtgtttgcaacaaacttacttgttccgttgtgcctacgaa 2626 to 2700 

MseI 

ggaataatggttgtgtataataccacttagtggttctgctctcacaccacacctttcatgggttccttaataata base pairs 
ccttattaccaacacatactatagtgaatcaccaaaacgagagtgtggtgcagaaagcacccaagaaattactat 2701 to 2775 

MseI HaeIII 
gttactgactttaagtttcttattcccttttgtctatcttagctggagaaaacgaggcagattacattggccgc base pairs 
caatgactgaaactcaaagaataaggaaaaacagatagaatcgaccccttttgctccgtctaatgtaaccggcg 2776 to 2850 

l4eF 

igacacg base pairs 
laaooctcotctcc tqtac 2851 to 2925 



MseI HaeIII HaeIII 
tacaccacacacacttctagtatatgtgtacacgttaatgggccaacactagacacatggcccaacatccccct base pairs 
atgtggtgtgtgtgaagatcatatacacatgtgcaattacccggtcgcgatctgtgtaccgggccgtaggggga 3001 to 3075 
gttctacccgctatctatagctagtaggggtagaacgacgcactgtgcagtgcgagaaaatgaggatatgggaat 3076 to 3150 
HinfI 

gccaagcaatctgccattcgacctcctgagtttacacgatccaaccctaaagcaccatcatccaacccccctctg base pairs 
cagctcgctagacgacaaaccggaaaacteaaacgcactaagctgagatctcacggtaatagattgaagagaaac 3151 to 3225 

ClaI 
HinfI 

acgaagaatcgatcaatttccacatgtttcgttctaccatgtcgaactggattgtcagctatacccatggctgac base pairs 
cacttcttagctagtcaaaggcgtacaaaacaagacagcacaactcgacccaacaaccgatataagtaccgactg 3226 to 3300 
TaqI 
Sau3AI 

HinfI MseI ISeF 

tcactaccacaccataaccccagggagtcccctcccaatacactcaaccccgataagagaccccccac££aiac[£ base pairs 
aataacagcgcggtactgaagcccctcagaaaagaattacgtaagttgagactactctctgggaaacaggcaccg 3301 to 3375 

HaeIII 

atetcacaf.arracaaga qccataactcaatattctQcttcaQcqqtqqaacQqgataccacaqattotttctta base pairs 
tagagcgcatagcgctcccggtaccgagccataagacgaagccgccaccttgccctacggcgtctaacaaagaac 3376 to 3450 



ScrFI 

MseI EcoRII 
ccccagtctgcatcagagtaaccttccacccttagatgaccatgacctttaaagattattccctttccaggacaa base pairs 
ggggtcagacgtagtctcattggaaggtggaaatctactggtaccggaaacttctaataagggaaaggccctgtt 3526 to 3600 

BstNI 

TaqI 

gtcctcaagtatcgcagtatacgatacaccgcatcaagatgtccacttctggggtcatgcatatatcgactcacc base pairs 
cagaagttcatagcgtcatacgctatgcgacgtagttctacaggtgaagaccccagtacgtatatagctgagtgg 3601 to 3675 

HinfI 

EcoRV 

acaccgactgcatacgtgatatcaggtcctgcatggcacaagtagacgagccgtccaacaagtctttgatacctt base pairs 
tgtgactgacgtatacactatagtccagaacataccgtgttcatctacccggcaggttgttcagaaaccatggaa 3676 to 3750 

Sau3AI HinfI HinfI TaqI HaeIII 

tctttattcacaggatcaccagactcagcacataattcatgactcaagccgacaggtgccgctacaggccgacac base pairs 
agaaacaagtgtcctagcggtctaagtcgtgtattaaatactaagttcagccacccacaacgatgtccggccgtg 3751 to 3825 

Sau3AI Sau3AI 
cccaacatacctgtttcatcaagtagatctaaaacatatttcctttgggagagaactattccttttggagatcga base pairs 
gggCtgtatggacaaagtagcrcatctagactttgtataaaggaaaccctctcctgataaggaaaacctctagct 3826 to 3900 

BglII TaqI 

Sau3AI HinfI 
gcaatctcaataccaagaaagtatttgagatgaccaagatctttaaccccaaatcccttacttagattctccttt base pairs 
cgttagagttatggttctttcataaactctaccggttctagaaactggagtttaaggaatgaatctaagaagaaa 3901 to 3975 

BglII MseI 

Sau3AI 

agacacgcaatctcaagatcggcatcacctgtaataacaatatcatccacatacacagctagaattgcaactcgt base pairs 
tctgtacgttagagttctagccgtagtggacattattattacagtaggtgtatgtgtcgatcttaacgttaagca 3976 to 4050 

Sau3AI 

cgcccaaagcgctgacaaaaaacagcgcgacctccgccgcatcgcccatatcccatgccacacactgcacgccca base pairs 
gcaggttccacaactattttccgtcacactagaggcaacgcaacaaatatagggcacgacgtacaacgtgcagat 4051 to 4125 

TaqI 

aatctgtcaaaccatgctcttggggactgcttgagaccatacaatgattttctcaatcgacaaactttcccaatt base pairs 
tcagacagcctggtacgagaacccctgacgaactctggtatgtcaotaaaaaagttagccgcttgaaagggttaa 4126 to 4200 

ScrFI 
EcoRII Sau3AI 

gtctcaggctccgacaacccaggagggacctccatacagacccccccttgcaaaccaceacgtaagaaagcaccc base pairs 
cagagcccgaaactgccaggtcccccccagaggtacacctggaggagaacgctcagcggtacaccctctcgtaag 4201 to 4275 



MseI HaeIII Sau3AI 

ttaacatctagttgatacaagggccatccaaaattcgcagcacaagagatcaatgcccttacagtactcactttt base pairs 
aactgcagatcaactatgttcccggtaggtcttaaacgtcgcgttctctagttacaggaatgccacgagtaaaaa 4276 to 4350 

gccactggtgcaaatgcctcatcacaatcaattccataCgtccgaccacaccctcctgcaaccaatcctgcttca base pairs 
cggtgaccacgttcacagagcagcaccagctaaggcacacaaaccgatatgggagaacgccggccagaacgaaac 4351 to 4425 

tatcgttctacccttccttctgggctttgcttcacagtgaacacccatttacaactaactgccttctttccttta base pairs 
at-agraagaCQQgaaaQaaQacccaaaagaaaQtatcacttataqataaatqttQattaacqgaaQaaaaaaaat 4426 to 4500 

18iR 

XbaI MseI 

ggcagtttctcaaattcecaagtctgatttttttctagagccttaagctcctccaacattgcctcacgccagtta base pairs 
ccatcaaagagtttaagggttcaaactaaaaaaagatctcgaaattcgaggaggttgtaacggagtgcggtcaat 4501 to 4575 



HinfI 

ggtgacaaagacgcacacgagacataattgctaatgtcatgttcatatccacaccttgttggggggactccagcc base pairs 

ccaccgtttctgcgcatactctgcattaacgattacagtacaagtataggcacggaacaacccccccgaggtcga 4651 to 4725 

HhaI 

tcagcacgcgctccttctcgtattgcaatgggcaaatcacaagcgtcacaatcctcagctcctccacgagaegtc base pairs 

aatcgtgcgcgaggaaaagcataacgttacccgtttagcattcacagtattagaagtcaaagaggtactccgcag 4726 to 4800 

aaaggtacatttatagcctctaatgtgtctggagagaactgctcagtacttgacgccgaactggtttcaggagcc base pairs 

tttccatgtaaatatcggagactacacaaacctctcttgacgagtcatgaactacgacttaaccaaagccctcgg 4801 to 4875 

tgaggttgcacatgggactttcttcttgtatatacttcgcccttatatcgtaagtcgtctccacaagattcatta base pairs 

actccaacgtgtaccctgaaagaagaacatatatgaagcgggaatatagcattcagcagaggtgttctaaataac 4876 to 4950 



HinfI TaqI 
attttatttggttgttgttccatcgaatcaaccactctgttctccccctctcgactagctccatctgtgctagta base pairs 
taaaataaacgaacaacaa aataacttaattaataaaacaaoaqo gQaaqaocCQatcqaaataqacacqatcaC 5026 to 5100 
ISiR 

HinfI Sau3AI 
gagacagaatcaagaaaaaaatctagacctgtctcctcaccatagaaaggcacagtctctctaaatgcaacatcc base pairs 
ccccgtcttagttctttttttaaatccagacagaagagtggtatcttcccgtgtcagagagatttacactgt.agg 5101 to 5175 
BglII 

HinfI PstI 
atgcctacaaacaaacgtcgttcactaggactccaacacttgtatcccttttgccctgcaggatatccaacaaaa base pairs 
tacgaatgttcgtttgcagcaagcgatcccgaggctgcgaacatagggaaaacgggacgtcctacaggttgctcc 5176 to 5250 

EcoRV 

BamHI Sau3AI 
atgcacttcacagcacgaggatccaacctccccacctgaggtctatgatccctgacaaaacatgtacatccaaaa base pairs 
tacgtgaagtgtcgtgctcccaggttgaaggggtggactccagatactagagactgttttgtacacgtaggtttt 5251 to 5325 
HinfI HinfI 

atttcaggcggaaccacaaacttattcccaccgagaagaatctcacatggagccttcaccgcaagtatttttgaa base pairs 
caaaacccacctcggtgctcgaataagagtggctcttcttagagcgtacctcagaagtaacgttcacaaaaactt 5326 to 5400 



ggagtgcgaccaataagacatgcggcagtcaatacagcttcaccccataggaacctcggaacattcactgtaaac base pairs 
rrrea cQrr.aattatrrratacaccQtcaq ctatqtcqaaqtqaqqc.atcctCqaaqcctcqtaaqtaacaCttq 5401 to 5475 
14iR 

HinfI 

atcagcgaacgagcaacttccaaaatgtgacgattctCcccttcagccacaccattttgtggaggtgcatcagga base pairs 
cagccgcttgctcgttgaaggttctacactgctaagaaggaaagccggtgcggtaaaacacctccacatagtccc 5476 to 5550 



MseI 

caggatgtctgacgtaatataccatttcttgacagaaatgcattaaatcccttgtttacatactcggttccattg base pairs 
gccctacagactacattatatggcaaagaactgtctttacgtaatttagggaacaaatgtatgagccaaggtaac 5551 to 5625 



11iF HinfI 

rf-rggrrrraggarrrrgartr.aaar.atcaaattaatcctcaactaatacacaaaaattttaaaaacacttcaat base pairs 
agaccagaatcctaaaactgaactcacaacttaactaagagttgatcacgtgtttctaaaacttttgtgaagcta 5626 to 5700 



12iF TaqI 
af-fttra^ptt hatactteaceacat aoaeeeaaoccattccqaqaaaaacaatcqataaaqtaaeaaaqtacttc base pairs 
tgaagtagaaatacgaagtagtgtatctgggttcagtaaggccctttttgttagctatttcattgcctcatgaag 5701 to 5775 

ClaI 



MseI 

accccatcaacagaagtcacaggacatgtccaaacatcagaatgaactagcacaaaaggagatatactcctgata base pairs 
tagggtaattaccttcagtgtcctgtacaggtttgtagtcttacttgatcgtgttttcctctatacgaggaccat 5776 to 5850 



TaqI 

cctcgactaatataagatgtccttgtgtgttttgcaaactcacaggcaccacacaatagcttgctttcatccacc base pairs 
ggagctgattatattctacaggaacacacaaaacgtctgagtgtccgcagtgtgttatcgaacgaaaataggtgg 5851 to 5925 

HindIII 

ccactcattacatcaggaaaagctccgcatatcctatcaaaagaaagatgccctaacctacaatgcaagagcatc base pairs 
ggtgagtaatgcagtccttttcgaaacgtatagaatagttttctttctacgggattagatgttacgttcccgtag 5926 to 6000 

Sau3AI 

actgcaacctccttctcttccattcttgttgccagcatagtgcatattgtaccattagtccccccatgatccata base pairs 
tgacgttggaggaagagaaggtaagaacaacggtcgtatcacgtataacatggtaatcaggggagtactaggtat 6001 to 6075 

ScrFI 

EcoRII MseI 
taccacaatccattacgcctggtagctgtcccaagtctcttccctgtttccctctcctgaattaaacaattatct base pairs 
atggtgttaggtaatgcggaccatcgacagggctcagagaagggacaaagggagaggacttaatttgttaataga 6076 to 6150 
BstNI 



TaqI Sau3AI EcoRV 

cgatcaagaataatacgacaatccaattgaccaaccaaggcacttagcgatatcaagttgacaggaaaggttggc base pairs 
gctagctcttattatgctgttaggttaactagttggttccgtgaatcactatagttcaactgtcctttccaaccg 6151 to 6225 
Sau3AI 

MseI 

acatacaaaactgatgacaacttaatagacggagtgcattgcactgtgccaacacccttgatgggttgcggtgta base pairs 
tgtatgttttgactactgttgaattatctacetcacgtaacgtgacacggttgtgggaactacccaacaccacat 6226 to 6300 



EcoRV 

ccatcagcagtttgtataatttctttacgtgtggggggatatcctatatatgatgtaaattcactggacgtgcct base pairs 
ggtagtcgtcaaacatattaaagaaatgcacacccccctatagaatatacactacatttaagtgacctgcacgga 6301 to 6375 
gtgacacgccttgatgcccccgagcctaaaacccattetaactgcgtgacctgtgcgggcacaaaagcatgagca base pairs 
cactgtacgaaaccacgaggactcagattttaggtaaaatcgacacaccggacacacccacgcttccgtacccgt 6376 to 6450 

HinfI Sau3AI 

taattacctccatcagcgtaggcgaagtggacaaaatcccctgtgtgagacccctgacccctacccccagagatt base pairs 
aCtaacggaagtagtcacacccgcctcacctgttttaggggacacactctgaggactagaaacagaggtctccaa 6451 to 6525 

tgatttttcttcctcaactctgtcccatcctcgcgttceacaaacgtttcaagttctcctcgtgtagctgctgca base pairs 
accaaaaagaaggagtcgaaacaaagtagaagcacaaggtactcacaaagctcaagaagaacacaccaacgacgt 6526 to 6600 

MseI 

ttcgcccttgcccaacccctgcctccacgacccctgccgcccctaggagcccctctccctctcccacgattaact base pairs 
aagcgggaacgggtcgaggacggaggtgctggagacggcggagaccctcggggagaaggagagggcgctaattga 6601 to 6675 



ScrFI 
HinfI HinfI EcoRII 

cccatagctgaaaacacaggatgaggcggcgtttgagaactttctctcatcactctgagtcttgactcctcctgg base pairs 
gagtaccgacctttgcgtcctaccccgccgcaaacccctgaaagagagtagcgaaacccagaaccgaggaggacc 6751 to 6825 

BstNI 

TaqI 

gatatggcagccatggctccttgcaggctaggaagagtggaccgatgaaacatggaggcacgccttccctcgaac base pairs 
rr.flrar^rgrrgat-aeeQaaaaacatecQaceectctcaeccaa ctactcrararereeoCQcaqaaq aqaqctCq 6826 to 6900 

13iR 

tctgagctcagcccccccagcaactgaagtacacgcctcctccccacccacttcctcgcccaagcaacacaccct base pairs 
agactcaaatcgggggaaccgctaaccccatgcgcagaaaaaaggcgggcaaagaagcgggcccgttgcgcgaga 6901 to 6975 

Sau3AI Sau3AI 

gagcgCggcagcccaacaggaccacaatgaCcaacaccagcccacaaacatcgcaacccccgaacgtaccccgcc base pairs 
ctcacaccatcgagtcatcctagtattactagccgtagccgggtatctgtaacactgaggacttgcacgaggcgg 6976 to 7050 

Sau3AI Sau3AI 13 iF 

af?agarfgf?rrf^gef;CQtttaacacraec|qag ocQaggteeaatctcgaccate aaeacaaeatetccagctccc base pairs 
cgcccagcgagggggacaaaccacaatacccccgccagaagtcagaggtggcagccgcaccgcaaaggccgaggg 7051 to 7125 

PstI HinfI 
gagtacattcctccaagcgccctccacacctctgcagcacccatgattgtatcaacagcgccagcaactgccgga base pairs 
ctcatgtaaagaagttcacgaaaggtgcaaagacgtcgtgaataccaacatagtcgtcacgatcgttaacgacct 7126 to 7200 

MseI 

atcacagaacCcaacaCccacgctgccacCaaagagcctatagcaccccagtccccccattcatcacttaacCCa base pairs 
tagcaccccgagtcgcaggcgcgacggcgattccccaaataccgtagggccagaaaggcaagcagcgaactgaat 7201 to 7275 

HinfI 

MseI XhoI XbaI 

Ccctcgggcccaacgacatccccttcaacatagcccccgagcctccttgcccccaacaaccgcaatgccccccta base pairs 
agaaagneQaahtaecacaaaaQaaaccataceaaaaacceaaaqaaagQaaaaeraCca aeaetaeaaqaaoat 7276 to 7350 
TaqI S4iR 

MseI 

gaccacgccaaataacttttcaccccccccaacttaatctcatttggcattaggtccaccccccgaaccggcccc base pairs 
gtagtacggcccattaaaaagcggggaagaccqaaCtflqaqiflaaccqCaflCCCagataqaagacctgaccaaga 7351 to 7425 
12iR 

MseI TaqI 
acatgagcaacaccgtccctaattgacgatggagcctcaccccctcttgctgacagtaacccgaccaatc caeca base pairs 
tacactcgtcgcaacagaaaccaaccaccaccccggagtagggaaaaacgaccgccaccaagccggccaaacggc 7426 to 7500 
PvuII 
agaacccgctcaacccctcgactttcccccacaacacatatgcaacaacaaccaggaaaccacctgggcagctgc base pairs 
ccttggacgagttaaggaaccaaaagggggcattacacatacaccaccatcgaccctccgacgaacccgtcgacg 7501 to 7575 

EcoRII 
ScrFI 

Sau3AI XbaI EcoRII BstNI HhaI Sau3AI 

gccaagacccgggtcacaacgtctagaagccaggaccaggagcgccccctcttcctccccctccgagctggacgg base pairs 
cagccccagacccagcgctgcagacctccggccccggccctcgcggaggagaaggaggaggaggctcgacccacc 7576 to 7650 

BglII BstNI ScrFI 

AvaII 

HpaII 

atctcagccacaggacgcgggcagcaggggggagcagcagcacctgcgcgccggcagcctccccaagggccggac base pairs 
tagagtcagcgccccgcgcccgccgtccccccccgtcgccgtggacacacggccgccgaaggagcccccaacccg 7651 to 7725 

MspI 

PvuII Sau3AI 
gagctgcggcagctggagagccccccaagcacccccacccccagatccttgtcgccgacgagtgcccgcgtccac base pairs 
ri-rgargergcegaectcticggagggttcqtgq oqataqa QatctaqaaacaacaQCtactcacqggcqcaqacq 7726 to 7800 
S6IW 

ScrFI 

HaeIII EcoO109I BstNI 

gtccttggccgccctcgccttgccggcggtggcgtcctctgcgctgtggctcgggacctgtccctggcctcctgc base pairs 
caggaaccggcgggagcggaacagccgccaccgcaggagacacgacaccgagccctggacagggaccggaggacg 7801 to 7875 

AvaII EcoRII 

HaeIII 

ScrFI 

HhaI EcoRII TaqI 

gcggcctccctgccggcgtcggtgtacccgcccgcctcctgcctggtcacgtcccccgcctccctcgatcgctcg base pairs 
cgccggagggacgaccgcagccacacgagcgggcagaagacggaccagcgcagggagcggagggagctagcgagc 7876 to 7950 
HaeIII BstNI Sau3AI 



HaeIII HaeIII Sau3AI TaqI HaeIII 

tgtgcctcggcggcctccttcggccgtcgctgatctccttctcggtggtcctctccgtcgaggccgaagacactc base pairs 
acacggagccgccggaggaagccggcagcgactagaggaagagccaccagaagaggcagctccggcctctgtgag 7951 to 8025 

ScrFI 
EcoRII 

gtcaccgcgacgccatcgecgttgagcctggctctgataccatgtggatttttccggaatgtggaaaacatacag base pairs 
raataaeaetaea otaqcQaeaaetgoaae eoaaaetacaQtaeaeetaaaaaQacettaeaecttttatatote 8026 to 8100 
11iR BstNI 

MseI HaeIII HaeIII 
caccctctctacaccacacacacttctagtatatgtgtacacgttaatgggccaacactagacacatggcccaac base pairs 
gtgagagagatgtggtgtgtgtgaagatcatatacacatgtgcaattacccggttgtgatctgtgcaccgggttg 8101 to 8175 

7F 

aqeatatcaaotqaeataacac tcacatttqtcatqqcaqQttatc aattctttqqtotccaCaagtqtgQaCqa base pairs 
tcgtacagtccaccgtatcgcgagtgtaaacagtaccgcccaacagctaagaaaccacaggcactcacacccacc 8176 to 8250 

HaeIII 8eF 

cgacaagcataaecctaqacgtgtttttettt ptaaqaaiaaaaaacgagaatatQg ttaaticacattateteeaa base pairs 
ggtgttegtaccgggatccgcacaaaaagaaagactccttttcscgctgtcacacgaagtaacgcaatagaggtt 8251 to 8325 

8eR 

Sau3AI PstI 
ggtcaagatagcccatgtcgacccaaatgtaagtttgctgcagtttgctgagagctctgtggtttcgctatacac base pairs 
ceagctptateaqacacaaetagacttagattgaaacaacatcaaa caactcccoa aacaccaaaagaatacara 8326 to 8400 

5RN 
BamHI PvuII 

ataatgtccctgactaccattgcttcgtcgcccacccgccccagatggacccaaaagccaaggctcagctgatag base pairs 
caccacaaagaccgacggtaacaaaaca^cggacgaacggaacccacccaggtccccggtcccgagtcgaccatc 8401 to 8475 

Sau3AI 



Msel 

ccccccgcaccaccttctctggctgactagccgaatgcagtcagctctgccaaagagtcaaacacacgagccgtc base pairs 
ggaagacgcagcagaaaaaaccaaccgaccgactcacgccaatcgaaacggccccccaacccacgcactcaacaa assi co S625 

TaqI MseI MseI 

cctgcactcgaaaagggatgccaataatgcccacaaactccgaaaatgtatttttagacacttaactcgtcaagc base pairs , 
ggacgcgagctcctccccacagttatcacaggtgcttgagacccctacacaaaaatccacgaatcgaacaaccca 8626 Co 8700 



TaqI Seq2FN 

raaCgargcgCtaattatatattaQacatcctQCCtQtccQaaaoccaatct acacaaacaqc etatgtaaCqCa base pairs 
gccaccgcacaaccaacacacaacccacaagacaaacaagccctcgatcagacgcgctcgccgaacacaccacac 8776 to 8850 

HindIII HaeIII 
aaacctcaaacaaacctgcccccccacaagcttaggtttacaggatcagcgtttagtgcatgtaaggcctacttg base pairs 
tttggagctcgtctgaacggagaagtactcgaacccaaatatcctaatcgcaaaccacgtacattccggataaac 8851 co 8925 

BstNI ScrFI ScrFI 

HaeIII SacI EcoRII TaqI EcoRII 

cttcacggcccccctgccgagcccccggccagacagccaccccggccgcaggcgcccgaaaccgaacacccggga base pairs 
gaagcgccggagggacggcccgaggaccgatctgtcggtaggaccggcacccacgggccctagctcgcggacccc 8 926 co 9000 
EcoRII BstNI BstNI 

ScrFI HaeIII 

ScrFI 
EcoRII 

gccacgtccgcaccagcaggcctcccCgggcgcaaaccaaacacgcctacagCgtccaagcacaaccgaaccggc base pairs 
cggtgcaaacgcgaccgcccaaaaggacccacgcccggcccgcgcggacaccacaagcccacaccgacccaacca 9001 Co 9075 
BstNI 

MseI 

gctcacctttgtctaacagctcaagtctttggtcctcatcggtgcacgcaactccacactcaacagtcaatatga base pairs 
cgagcggaaacagatcatcgaactcaaaaaccaaaagcagccacgcacgtcgaggcatgagtcaccagccatacc 9076 Co 9150 

BstNI 

XhoI HinfI 

Cacagcgcccaagcacagaaccctcgagctcgaatcccggcaggggcaaccaaacaaaacaaccgcagcccaccc base pairs 
acaccacaagcccgcaccccgagagcccaaacccaggaccgcccccgccagcctaccttaccaacgccgaacggg 9151 Co 9225 
TaqI EcoRII 
ScrFI 

S3iF 

ccacctgcaggcte QaacacataaaqQaaaacottQ aattataaQCqcgctctccaccctcctccaacaoacoaa base pairs 
gacaaagacgcaaactcgcgtaccccccctcacaacccaacacccacacaagaggcagaaagagaccgcccaccc 9226 co 9300 

HinfI MseI MseI 

ctggcccgcgcacgcaacccaacacgacacccgagccaaacgcccaccccaaaaccacagccgacgcaacccaac base pairs 
gaccaaacacgcacatcgagccacaccacaaacccagcccacaaacgaaatcccagcaccaaccacgetaaacca 9301 co 9375 
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Continued 

aacatattcctctggtcccgcgtgagggagcgtacgtataacc.gaact9cacacattcccttacagcctaggttt base pairs 
ccgtacaaaaaaaccagagcacactcccccacatgcacattgacttaacgtgtgtaaaggaacaccgaatccaaa 9376 Co 94S0 

Sau3AI 

ttgactgcaactgccggtgcacgtagctcaacaaccaaagtcgacctggacagcccacagtgaataagtttgaca base pairs 
aactgacgttgacaaccacgtacatcgagttattgactccaactagacctgccagacgccactcatccaaactgt 9451 to 9525 



cttgcaaaatgcgcatgtatttttacaaacgctggcaccttttccctaatagaaaatgggcagccaggcagtgat base pairs 
gaacatctcacacgcacacaaaaatgcttgcgaccgtgaaaaaaggaccaccccctacccgccagcccgccacca 9526 co 9600 

Sau3AI 

accgcttcgggcatttcttctgatgatgtggacctggagacgtcatctagtacgccaacgaggacagcaaccctt base pairs 
tggcgaagcccacaaagaagaceaccacacctagacctccgcagtagaccatacggttgctcctgtcgttgggaa 9601 to 9675 

Sau3AI 

cttgatctgtactctggctgtgggggcatgtctactggtctttgcttgggtgcagctctttctggcttgaaactt base pairs 
gaactagacacaagaccgacacccccgcacagacgaccagaaacgaacccacgtcgagaaagaccgaactttgaa 9676 Co 9750 

Sau3AI 

gaaaccgcaaccctctaactagccacccgttggatagaacacgttcacgacctcagaacctaccccactgtcccg base pairs 
cttcgacactagaagattgatcagtagacaacccatcctatacaagcgccagagccttgaataagataacaagac 9751 to 9825 

MseI 

gctcgcagcgatgggccgttgacttcaacagccttgcgtgccaaagtttaaaatataatcatccacagactgagg base pairs 
cgaacgccgctacccgacaaccaaagctgtcaaaacgcacggtttcaaactctacaccagcaggcgtctgacccc 9826 Co 9900 

HinfI 

catggatagtaaactccaccttggattccatctgctctgccagccactcctacaaagtgcctggactcccggatg base pairs 
atacctaccacctgaagtagaacctaaggcagacaagacagccgacgagaatgctccacagacccaaaaacccac 9901 Co 9975 

MseI 

taggCgcgaaacgagaaagccgaCgagtcccCcgccccccccaaggaacgggcagCCccacgcaaaaaacacgcc base pairs 
acecacgccccacccccccggctacccaaagaacgggaggaattcctcacccgccaagacacgttcttcatacag 9976 to lOOSO 

HinfI Sau3AI 
caagatgcggacccaaacctagcaagcccagaggatcaagcggacgaagacagccctcctgacaaggacgaatcc base pairs 
g(-t■■-^af^arg^aa aC1:taaaeeaf■eeaaatetcetaQ tt.caccCacttct.QtcaQQaQaaccqCCcctQC^:taaa 10051 Co 10125 
Seq2RN 

HindIII MseI 
gctgtagagaagcttgccgggacacgttacggtggcagcgacagggaaaacggcacctattctaaggtacctcag base pairs 
caacatctctccgaacagccctacacaataccaccgtcaccgcccctttcaccgcagataaaatcccatgaagcc 10126 to 10200 

HinfI MseI 
tgtcatttgttoatctccacttgactccaacaaaaaaatcaattacttaagcccgtcaaacgatggatatctccg base pairs 
acagcaaacaagtaaagatgaaccaaggccgttcccccagctaacgaatccggacagtttgccacctataaagac 10201 to 1027S 

PstI HaeIII 
tatatcttgccgtaacgccagacttctgcaggtccagtgggaaggacacggccctgaggaggatacatgggaacc base pairs 
atacaaaacgacatcgcgacctaaagacgtccaggtcacccttcccatgccgggacccctcccatgtaccctcgg 10276 to 10350 
AvaII 

gattgataacctgaggttagtgtatggtatatcgcctgcttgttgccttgtatacctactcgcatctaatccttg base pairs 
/'i-aaiTafrgaarfppaaCraeaf area fata ogaoaeaaacaagQaaacaCataa ataaacQCaoaetaQQaac 10351 CO 10425 
Seq1RN 
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FIG. 5 
Continued 

ttccgcaaaccagtgactgcccgcagaaaatcagagaacctgtacaagaagggcacaaaagaaagacccccccac base pairs 
aaaal-grr^gg^■eagt QacgagcQtc:ttttaatccc;t llaaacatqttctteccc;t:attttctt:tcl:aaaagQgtq 10426 Co 10500 
S3eR 

EcoRV 

cgcctgtgagtatctagcccgctgtgaccctgctcgccatttgttcagctccccttttttatctggcgatatctg base pairs 
acggacacccacaaatcaagcaacaccaaaacgagcgacaaacaaaccgaggggaaaaaacaaaccaccacagac 105O1 Co 1057S 

HaeIII MseI 
cctacttcattctcccaaagggtgatgctgatgtcatttgtggaggcccaccatgccaaggtatcagtgggttta base pairs 
ggacaaaacaagaaagcttcccaetacaactacagcaaacacctccgggtggtacggttccatagtcacccaaat 10S76 co 10650 

3F 

at-.eqa tagaaaaacegtgacaaaccac tcaaaqataagaaaaacaaacaaatggtaactttcacqqatattgtqq base pairs 
ragrrargrperr.ga cacCactcoqcaaqcttctac teCttttqttcqtttaccacCQaaaqtacccataacacc 10651 CO 10725 



HhaI IF 
aai-argf?Frt:gagf:cgccctattgetataaaqCaccaaoeqeqqcctqqaacqatgqtooc tagetact:atqqCc base paii 
ctatacgaaactcgacggaacaacgacacctcatggctcgcgccgaaccccaccaccaccgaccaacgacaccag 10801 co 



Race2B 

ScrFI 

HinfI EcoRII 
catgcttcgccagacccatatcgcaccgccggccgccggccaaccaggtgcacgcgcacccgacaatctaggege base pairs 
atacgaagcgatccaagcacaacgtgacaaccgacgaccgaccggcccacacgcacacaaaccgtcaaacceaeg 10951 Co 11025 

BstNI 

Race1A 4F 

roontrnnatiati - oo cccgcccaactatgatqtcat aocaeqtaaaqaaqeeectaacgcgtttcgqtqaqcacaat base pairs 
nfjgg . - i i-f pn i-ngqaaacogat.tqatactaeaacaCc atqcacceccCcag9gattacqgaaaqccacCcacqCCa 11026 to 11100 
Race1B RaceRT 

cacaaaccaccaccatgaaaccatgtggaacgtgtaaaatacgctgaccaaccgaactcgctgcagcaacgtacg base pairs 
gtgtctggcgatgacaccccagcacaccCtacacaccccatgcgactggctgacccaaacaacgtcgccacaCac 11101 co 11175 

gttgcacatgacgagacacaaaaaccacccctgaaaaaagccttgcctctcggcgacgcaacttcagacttacca base pairs 
caacgtatactgctctgtgtctctggtagggacttttctcggaacgaagaaccgccacgccaaagcccaaatggt 11176 co 11250 

MaeI PstI 

aaggcaagcgctccgtcaagttcacgcactcctcagcgagcacgccacctaacccccccccgcaggcccaaaacc base pairs 
ctccgctcacaagacagctcaagtacgtaaagagtcacccgcacgacaaactgagaagagacgcccaagctccag 11251 co 11325 

EcoRI HhaI TaqI 

accagcccaacgatgcgacggagtatggcggtcccccaagaccgaattccagcgctacattcgactcagccgcaa base pairs 
Cggccggatcgctacaccacctcacaccaccaaggggctccggcccaaggccgcgacgcaagctgagccagcatc 11326 to 11400 

HinfI 

HaeIII MseI 
aggtaaaaaaccccgtgaactaccactggctggcccccaccacgaacatgccaggatctaacctcagaagaaccg base pairs 
cccatccctcggggcacttgatgacgaccaaccggaagtgatgcccatacaatcctaaattaaagccttcctggc 11401 co 11475 

PstI AvaII 
cctttcttttcccggcgctccggtactactgcagcaagctcactctcaccaccacgccagacacgccggactggt base pairs 
ggaaaaaaaagaaccacgaagccatgatgacgtcgttcgagtgagaataatagtacagcctgtacaacctaacca 11476 to 11550 
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FIG. 5 

Continued 

AvaII Sau3AI MseI 

. ccttcggtgaagggggccggcccagacgaaggcaagctcttggatcaccagccttcacggcttaacaacgacgat base pairs 
ggaagccacctcccccgaccaggcccactcccgtccgagaacccagcggccggaaatgccgaactgccgctacca 11551 to 11625 

HinfI 

catgagcgggttcaacagatccctgtcaagaaggtcggCQgcttggtcgcacccgtgcctccccccgttgCCttt base pairs 
acacccgcccaagttgtctaaggacagcccccccaaccaccgaaccagcgcaaacacggaaggaaacaacaaaaa 11626 co 11700 



BamHI TaqI 

aagcaaacaatattgtcgagtgggatccagaaaccgagcgtgtgaaacttccacccgggaaaccactggtacgcg base pairs 
cccgtttgtcacaacaactcaccctaggcccctagcccgcacaccccgaaagtagaccctttggtgaccatacac 11776 to 11850 
Sau3AI 



HinfI 

tgctgcaactactgtaagattcatggctaacccatgacaacattctgcacacatccttgctacctaggttcccga base pairs 
acgacgctgacgacatcctaagtaccgatcgggtaccgtcgtaaaacgtgtgtagaaacaatagatccaaggact 11926 co 12( 

ctacgcaatgtcactcaccaagggcaaatcactcaagcaagtcccaaaacattctctccgcttcctgggggaaaa base pairs 
gatacgctacagtaagcagctcccgcccagcgagcccatccaaagctccgcaaaaaaaacaaaaaaccccctttc 12001 co 12075 

HaeIII HhaI 

gtaggttactgtccactcgtgcttacatatgacgctgcaggccgtttgggcgcctgcggtgggacaagacagtcc base pairs 
cacccaataacaaatgaacacgaatgtatactacaacgtccggcaaacccgcggacaccaccctgttccgtcaag 12076 Co 12150 

ScrFI 

EcoRII HaeIII 
ctacagttgtaaccagagcagagccccacaaccaggccagcttcagaaaggccactccttttcgccaatccctgc base pairs 
gatgtcaacaccggtctcgtctcggagtgccggtccagtcgaagtcttcccggtgaggaaaagcggttagggacg 121S1 Co 12225 
BstNI 

Sau3AI 

atctgtacttactattagcgtgtgtccccatatgaccatcaccgaacacgttgtccacacaggttataactcatc base pairs 
tagacataaatgataatcgcacacaagggtatactagtaatggcttgtacaacaggtgtgtccaatattaagtag 12226 to 12300 

ScrFI 

HinfI EcoO109I HpaII 

cgactcaagcaagggtccccaccatccgggagaacgcaaggtcacagggcttccccgatcaccaccgaccgtccg base pairs 
gctgagttcgctcccaggagtgataggccctcttgcgttccaatgtcccgaaggggccaacaatggctaacaaac 12301 to 12375 
AvaII MspI 

HaeIII Sau3AI 
gcccgatcaaggagaagtaagcccctgcccccaagccgcctgcaccagacccagccactaccgaaagtctccagc base pairs 
cgggccagcccccccccacccaaggacaaaagcccaacggacacggcccagaccagcgacaacccccaaaagccg 1237S to 124S0 
Sau3AI BglII 



MspI 

HpaII EcoRII 
aggtacattcaagtcgggaacgcagtggctgcccccgtcgcccgggcactgggctactgtctggggcaagcctac base pairs 
tccatgtaagttcagccctcgcgtcaccgacagggacaacgggcccgtgacccgacgacagaccccgttcggatg 12526 to 126C 

ScrFI 
SmaI 
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scrFi FIG. 5 

Continued 

HinfI PvuII 

ctgggtgaacctgaggggagtgaccctccgtaccagccgcctccaagtttcacctccgctggaggacgcactgcg base pairs 
gacccactcagactccccccactgggagacacggccgacggaggctcaaagtggagacaaccccctgcgtgacgc 12601 Co 12675 

BstNI 

EcoO109I PstI Sau3AI 

gggcaggcgagggcctcttcctgtcggcacccccgcaggggaggtagccgagcagtaaaaggacgacagatccga base pairs 
cccQt:ccQCtcccqaaqaaqqacaacccitqqqaacqtcccctccatcaactCQt:cat ttt:c:eragi:acgt:aaapr 12676 Co 12750 
HaeIII IR BglII 

TaqI 

gccgagccgggcaacatccagcggcaggagcatttctggcccggttcgacccgggctcacga base pairs 
cgaitcgfliccgccgcaggtcgccgtccccgcaaagaccaagccaagctaagcccgagcgcc 12751 co 12812 

HinfI 



SUBSTITUTE SHEET (RULE 26) 



WO 00/53732 



PCT/US00/06456 




FIG. 7 



Size markers: 6000 bp, 2500 bp, 1000 bp 
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FIG. 8 




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FIG. 9 





Motif   SAM binding              Cytosine binding 
        M.HhaI    Zmet2a         M.HhaI    Zmet2a 

I       Phe18     (illegible) 
II      Glu40     Gln407 
        Trp41     Trp408 
III     Asp60     Asp428 
IV      Pro80     Pro516         Phe79     Pro515 
        Gln82     Gln82          Cys81     Cys517 
V       Leu100    Val542 
VI                               Glu119    Glu559 
                                 Asn120    Asn560 
                                 Val121    Val561 
VIII                             Arg165    Arg605 
X       Asn304    Asn851 
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FIG. 10 10 A gkbribosomalrepoal 
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FIG. 12 



GENOTYPE                  NUMBER OF PLANTS   TOTAL 5mCytosine (%)   % WT levels   % decrease 

wild type                 3                  34.40 ± 0.55           100           0.0 
heterozygous zmet2a-mu1   7                  32.00 ± 0.90           93.0          7.0 
homozygous zmet2a-mu1     5                  30.40 ± 0.19           88.4          11.6 
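
The last two columns of the FIG. 12 table appear to follow directly from the mean 5-methylcytosine values: each mean divided by the wild-type mean, and the complement of that ratio. The sketch below is illustrative only and is not part of the original figure; the names MEAN_5MC, percent_of_wild_type, and summarize are hypothetical, and the assumption is that values are rounded to one decimal place.

```python
# Mean total 5-methylcytosine (%) per genotype, copied from the FIG. 12 table.
MEAN_5MC = {
    "wild type": 34.40,
    "heterozygous zmet2a-mu1": 32.00,
    "homozygous zmet2a-mu1": 30.40,
}


def percent_of_wild_type(mean, wild_type_mean):
    """Express a genotype mean as a percentage of the wild-type mean."""
    return 100.0 * mean / wild_type_mean


def summarize(means, reference="wild type"):
    """Return {genotype: (% WT level, % decrease)}, rounded to one decimal.

    Assumption: the % decrease is 100 minus the rounded % WT level.
    """
    wild_type_mean = means[reference]
    rows = {}
    for genotype, mean in means.items():
        pct_wt = round(percent_of_wild_type(mean, wild_type_mean), 1)
        rows[genotype] = (pct_wt, round(100.0 - pct_wt, 1))
    return rows


if __name__ == "__main__":
    for genotype, (pct_wt, decrease) in summarize(MEAN_5MC).items():
        print(f"{genotype}: {pct_wt}% of WT, {decrease}% decrease")
```

Under these assumptions the computed values agree with the tabulated 93.0 / 7.0 and 88.4 / 11.6 entries, which also supports reading the wild-type "% WT levels" entry as 100.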
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FIG. 16 



5' LTR 

catgcTGTTGGGCCATGTGTCTAGTGTTGGCCCATTAACGTGTACA 
CATATACTAGAAGTGTGTGTGGTGTAGAGAGAGTGCTGTATGTTTT 
CCACATTCCAGAAAAATCCACATGGTATCAGAGCCAGG 
PBS 



3' LTR 

PPT 

GAGGGGGAGTGTTGGGCCATGTGTCTAGTGTTGGCCCATTAACGTG 
TACACATATACTAGAAGTGTGTGTGGTGTAGAGAGAGTGCTGTATG 
TTTTCCACATTCCAGAAAAATCCACAcatgc 
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FIG. 17 



Gag 

SPRITE-1    CYNCGNVGHIARNC 
hopscotch   CQVCSRVGHTALNC 
retrofit    CQVCFKRGHTAADC 
arabpoiprt  CSNCGRTGHEKKEC 
copia       CHHCGREGHIKKDC 



Protease 

SPRITE-1    TQVTQLKWILDSGASKH 
hopscotch   QNGSNVPWYTDTGATDH 
retrofit    SYGIDTNWYIDTGATDH 
arabpoiprt  GKTKLGDIILDSGASHH 
copia       SVMDNCCmDSGAS DH 



Integrase 

SPRITE-1    QVKILRPDN-GTEYVNKGFNAFLSRNGILHQTSCPDTPPQNGVAERKNRHILE 
hopscotch   KIIAFQSDW-GGE~YEKI.NAHFKTIGIHHQVSCPHTHQQNGAAERKHRHIVE 
retrofit    KII AMQTDWRGGR-- YQKLN S FFAQIGL I IMCHVLTLI RQNG SAERKHRH I VE 
arabpoiprt  TVKMVRSDN-GTE-- mCLSSYFRENGI IHQTSCVGTPQQNGRVERKHRHILN 
copia       KWYLYIDN-GRE YLSNEMRQFCVKKGI SYHLTVPHTPQLNGVSERMIRTI TE 

Reverse Transcriptase 

RYKARLVARGYSQ'nGIDYDETFRPVJ\KMSTVRTLISCAANFG«PLWLDVKNAFLHGDLQEEVYMEIPPG ( 59) AILAVYVDDIII 
RlKARLVAKGFKQQYGIDYDDTFSPWKHSTIRLVLSLAVSQKWSLRQLDVQNAFLHGlLEETVlfMKQPPG (59) lYILVYVDDIII 
RYKARLVAKGFKQRYGIDYEDTFSPWKAATIRllLSlAVSRGWSLRQLDVQNAFLHGFLEEEVYMQQPPG(59)MFVLVYVDDIIV 
RYKARLVVQGNKQVEGEDYKETFRPVVR»mvRTl.l4RNVAANC|WEVYQMDVKNAFLHGDl.EEEVYMKLPPG (59)LRVLIW^ 
RYKARLVARGFTQKYQIDYEETFAPVARISSFRFILSLVIQYNLKVHQMDVKTAFLNGTLKEEIYMRLPQG (59) lYVLLYVDDWI 



RNase H 

SPRITE-1 - DRDMGSCIJDRRSTSGYCVFVG6-«l.VSWRSKKQSWSRSTAEAEYRAb»LAICEMlWIKGIi(25)NPVQHDR^ 
hopscotch - DADWAGCpODRKSTGGYALFI^P-tn.lSVIieKKQSTVSRSSTEAEYKAMAtATAEVIWLQSLLU5)KPIFl«BTKHIEVDrHr 
retrofit - DADWAGSlODRKSToGFAVFLGS-NLVSWSARKQprvSRSSTEAEYKAVANTTAELIWVQTLI. (25)NPVFHARTKHIEVDYHF 
arabpoiprt- DSDWQSCPLTRRSISAYWIJU;G-SPlSWKrKKQDrVSHSSAEAEYRAMSYALKEIKHLRKI.I,(25)NPVFHERTKHIESDCas 
copia - DSDWAGSEIDRKSTTGYLrKMFDFtlLICWNTKRQNSVAaSSTEAEYMALFEACREALWLKFLL (25)NPSCHKRAKHID1KYHF 
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FIG. 18 
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FIG. 20 



A B 
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FIG. 23 



GGGAATTCGATTACTCACTATAGCGCTCGAGCGGCCGCCCGGGCAGGTTCGAAAACCATC 
AACCTAACGATGTAATGGAGTATGGTGGTTCCCCCAAGACAGAGTTCCAGCGCTACATTC 
GACTTGGTCGTAAAGACATGTTGGATTGGTCGTTTGGTGAGGAGGCTGGTCCAGATGAAG 
GCAAGCTCTTGGATCACCAGCCCTTACGGCTTAACAATGATGATTATGAGCGGGTTAAGC 
AAATTCCTGTCAAGAAGGGAGCCAACTTCCGTGACCTAAAGGGTGTCAAGGTTGGAGCAA 
ATAATGTTGTTGAGTGGGATCCAGAAGTCGAACGTGTGTACCTTTCGTCTGGGAAACCAC 
TGGTTCCTGACTATGCGATGTCATTCATCAAGGGCAAATCACTCAAGCCATTCGGGCGCC 
TGTGGTGGGACCAGACGGTTcCTACAGTTGTGACCAGAGCAGAGCCTCATAACCAGGTTAT 
ATTGCATCCGACTCAAGCAAGAGTCTTGACTATCCGGGAGAACGCAAGGTTACAGGGCTT 
CCCCGATTACTACCGATTGTTTGGACCGATCAAGGAGAAGTATATTCAAGTCGGGAACGC 
AGTGGCAGTCCCTGTTGCACGGGCACTGGGCTACTGTCTGGGTCAAGCCTACCTGGGTGA 
ATCTGACGGGAGTCAGCCTCTGTACCAGCTGCCTGCAAGTTTTACCTCTGTGGGGCGAAC 
CGCGGTTCAGGCGAATGCCGCTTCTGTTGGCACTCCTGCGGGGGAGGTAGTCGAGCAGTA 
AAAGGATAGCGGAGCAACCCTGGTTGGTATTTTGATTCGAGCCCATCCAGTAGCATGTTT 
ACCAATAAATAATCATTGGTCGTGCTGATTCTTATGGTTGGAGATGAATGTATGTAGGGT 
GTACTCGAGCTCGAGTGCTTGTTGTACTGTAGGTTGAGGTTTCTCATCCATTGGCCTGCC 
TATTTGTGGATGACGTTTCATTTCAGATTAGCAATGTGCTTATTTAAGGTTTCGTCATGT 
ACCTGTATTCTACAATCCACTATTGTTTCCAAAGACAGCATTTGATCCTTAAAAAAAACT 
GTAAAAAAAAAAAAACAGTGCCCGAAAAGCCGCAAAAAAAAAAAAAAAAAAAACCTGCCC 
GGGCGGCCGCTCGAGCCCTATAGTGAGTAATCGAATTCCC 
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FIG. 24 



EFDYSL*RSSGRPGRFENHQPNDVMEYGGSPKTEFQRYIRLGRKDMLDWS 
FGEEAGPDEGKLLDHQPLRLNNDDYERVKQIPVKKGANFRDLKGVKVGAN 
NVVEWDPEVERVYLSSGKPLVPDYAMSFIKGKSLKPFGRLWWDQTVPTVV 
TRAEPHNQVILHPTQARVLTIRENARLQGFPDYYRLFGPIKEKYIQVGNA 
VAVPVARALGYCLGQAYLGESDGSQPLYQLPASFTSVGRTAVQANAASVG 
TPAGEVVEQ* 
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667 KVONHQPNDVMEYGGSPKTEFQRYIRLSRKDMLDWSFGEGA GPDEGKLLDHOPLRLNNDD 726 

+ -^NHQPNDVMEYGGSPKTEFQRYIRL RKDMLDWSFGE AGPDEGKLLDHQPLRLNNDD 
15 RFENHQPNDVMEYGGSPKTEFQRYIRLGRKDMLDWSFGEEAGPDEGKLLDHQPLRLNNDD 74 

727 YERVQQIPVKKGANFRDLKGVRVGANNIVEWDPEIERVKLSSGKPLVPDYAMSFIKGKSL 786 

YERV+QIPVKKGANFRDLKGV+VGANN+VEWDPE+ERV LSSGKPLVPDYAMSFIKGKSL 
75 YERVKQIPVKKGANFRDLKGVKVGANNVVEWDPEVERVYLSSGKPLVPDYAMSFIKGKSL 134 

787 KPFGRLWWDETVPTVVTRAEPHNQVIIHPTQARVLTIRENARLQGFPDYYRLFGPIKEKY 846 

KPFGRLWWD+TVPTVVTRAEPHNQVI+HPTQARVLTIRENARLQGFPDYYRLFGPIKEKY 
135 KPFGRLWWDQTVPTVVTRAEPHNQVILHPTQARVLTIRENARLQGFPDYYRLFGPIKEKY 194 

847 IQVGNAVAVPVARALGYCLGQAYLGESEGSDPLYQLPPSFTSVGGRTAGQARASPVGTPA 906 

IQVGNAVAVPVARALGYCLGQAYLGES+GS PLYQLP SFTSV GRTA QA A+ VGTPA 
195 IQVGNAVAVPVARALGYCLGQAYLGESDGSQPLYQLPASFTSV-GRTAVQANAASVGTPA 253 

907 GEVVEQ 912 

GEVVEQ 
254 GEVVEQ 259 
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